-
Notifications
You must be signed in to change notification settings - Fork 412
Modular handshake #494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modular handshake #494
Conversation
…r, and only expose old peer_channel_encryptor.rs to fuzz tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly just skimmed it but left some notes for things I came across. The high level API looks good (modulo some "we should move, like, everything out of ln::" thinking, but I am worried about overuse of Vec in a bunch of places (malloc is more expensive in surprising ways than you think, really), and needs docs.
lightning/src/ln/peers/conduit.rs
Outdated
@@ -0,0 +1,173 @@ | |||
use ln::peers::{chacha, hkdf}; | |||
|
|||
/// Returned after a successful handshake to encrypt and decrypt communication with peer nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add more details about how you'd use this, what the requirements are for use, etc? Same goes for pretty much everything pub in this PR.
@@ -0,0 +1,34 @@ | |||
use util::chacha20poly1305rfc::ChaCha20Poly1305RFC; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this in ln::peers? It seems to be pure crypto functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the AEAD-based encryption methods are only used for handshakes and peer message encryption IIRC, and not for the onion construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but its also a pure-crypto primitive. I guess my preference is for such things (even if it implements a lightning protocol crypto primitive) to be in some kind of crypto module.
lightning/src/ln/peers/conduit.rs
Outdated
let length = buffer.len() as u16; | ||
let length_bytes = length.to_be_bytes(); | ||
|
||
let encrypted_length = chacha::encrypt(&self.sending_key, self.sending_nonce as u64, &[0; 0], &length_bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof. Lets avoid allocating a Vec for two bytes somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this is the right place to shave off memory usage at the expense of legibility? I'm a bit hesitant to do the whole decrypt in place approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think it changes legibility, and its not a memory usage question, its a question of heap fragmentation and significant performance penalty to hit the heap. I'm dubious you can't make it readable without using a Vec.
lightning/src/ln/peers/conduit.rs
Outdated
} | ||
|
||
pub(super) fn read(&mut self, data: &[u8]) { | ||
let mut read_buffer = if let Some(buffer) = self.read_buffer.take() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would probably be more readable with https://doc.rust-lang.org/std/option/enum.Option.html#method.get_or_insert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooooh, TIL. There are a bunch of places where I will replace that.
} | ||
|
||
impl Act { | ||
pub fn serialize(&self) -> Vec<u8> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name this something else? I almost responded with "why the hell would you ever want to serialize an Act for storage?" before I realized it meant "write act to a vec to send to a peer" (also, can we not return a slice here?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps to_vec
? We could return a slice, but would have to mark its lifetime as linked to the act object's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's wrong with using lifetimes? The callsite can chose to store it in a Vec or copy to some local buffer if they wont want it.
@@ -0,0 +1,398 @@ | |||
use bitcoin_hashes::{Hash, HashEngine}; | |||
use bitcoin_hashes::sha256::Hash as Sha256; | |||
use rand::{Rng, thread_rng}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't support rand because it a) often breaks MSRV, and b) doesn't support lots of platforms (read:embedded) that we do, where RNG access is via some proprietary API. Any random values you want need to be passed in from the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll try to refactor it to have the user pass in entropy.
I'm afraid not, because some versions of Rust are confused about variable scopes with macro invocation. |
But for what it's worth, the only file that I am really modifying that existed prior is peer_handler.rs. And this diff is gonna impact it pretty severely one way or another. |
Actually, I may have been a bit confused here. The primary cause of all unrelated diffs is the indentation change. Could you introduce scopes to prevent this? FYI, this comment was suppose to be on |
@@ -501,266 +531,214 @@ impl<Descriptor: SocketDescriptor, CM: Deref> PeerManager<Descriptor, CM> where | |||
} | |||
} | |||
|
|||
macro_rules! insert_node_id { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this? We would no longer insert into the map without it, AFAICT. This breaks other code that needs the map to look up peers by id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of the unit tests are failing, but I will re-add it to the graph, and then think of unit tests for this in a separate diff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this code isn't well tested but most (all?) of the public methods depend on that map.
Honestly, I think the indentation change might be because I moved the macro definitions. But I stand by the moves :) |
lightning/src/ln/peer_handler.rs
Outdated
let mut features = InitFeatures::supported(); | ||
if self.message_handler.route_handler.should_request_full_sync(&peer.their_node_id.unwrap()) { | ||
features.set_initial_routing_sync(); | ||
match message { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you insert three scopes around this, the diffs become saner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only looked at peer_handler.rs so far, though I do have a good sense of the other code from looking at it earlier. I like the direction this is moving in!
Could you squash some of these commits? See guidelines about atomic commits. Some squashes could wait until the end of the review for sure. But at very least separate commits for making it compile with Rust 1.22.0 are not relevant to the code review. Likewise for the separate commits used to remove the extraneous diffs.
Also, please include the why not just the what in your commit messages (when applicable) and PR description for benefit of reviewers and posterity. See the best practices for writing commit messages that are linked in the guidelines.
lightning/src/ln/peer_handler.rs
Outdated
@@ -596,33 +591,32 @@ impl<Descriptor: SocketDescriptor, CM: Deref> PeerManager<Descriptor, CM> where | |||
log_trace!(self, "Received message of type {} from {}", message.type_id(), log_pubkey!(peer.their_node_id.unwrap())); | |||
|
|||
// Need an Init as first message | |||
if let wire::Message::Init(_) = message { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to be nit-picky about this, but we need to aim for atomic commits for the reasons mentioned in our contributing guidelines. All the changes from here to the end of the method appear unrelated to your previous changes and aren't necessary.
I'm all improving formatting, but I'd prefer if this were done at one time when we don't think it will be disruptive. Doing so then will also make the blame history easier to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, I can't see these changes anymore due to squashing. Was it an indentation change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All changes from the following line to the end of the method:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got most but still a few unrelated changes at lines 591 to 593 and line 755.
@@ -501,266 +531,214 @@ impl<Descriptor: SocketDescriptor, CM: Deref> PeerManager<Descriptor, CM> where | |||
} | |||
} | |||
|
|||
macro_rules! insert_node_id { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this code isn't well tested but most (all?) of the public methods depend on that map.
lightning/src/ln/peer_handler.rs
Outdated
let mut handshake = PeerHandshake::new(&self.our_node_secret, &self.get_ephemeral_key()); | ||
handshake.make_inbound(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than calling a separate method after new
, how about replacing new
with make_inbound
and make_outbound
constructors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather new_inbound and new_outbound, sure
lightning/src/ln/peer_handler.rs
Outdated
|
||
if let &mut PeerState::Handshake(ref mut handshake) = &mut peer.encryptor { | ||
let (response, conduit, remote_pubkey) = handshake.process_act(&peer.pending_read_buffer, None).unwrap(); | ||
peer.pending_read_buffer.drain(..); // we read it all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment looks like a code smell. The drain introduces a dependency between PeerManager
and PeerHandshake
. That is, one module needs to know how the other is implemented. Could you remove this coupling by passing the data to process_act
using move semantics rather than by reference?
That said, I need to consider how decrypt
is used. We can discuss further offline if you'd like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've got an idea for this. Add a method to handshake to read only one message, but store the buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, that's already happening
lightning/src/ln/peer_handler.rs
Outdated
} | ||
|
||
if let Some(conduit) = conduit_option { | ||
// Rust 1.22 does not allow assignment in a borrowed context, even if mutable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow what you were trying to do as expressed by this comment. What statement wasn't compiling for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement could have been done inside the big if clause preceding these single line assignments.
lightning/src/ln/peer_handler.rs
Outdated
macro_rules! encode_and_send_msg { | ||
($msg: expr) => { | ||
{ | ||
log_trace!(self, "Encoding and sending message of type {} to {}", $msg.type_id(), log_pubkey!(peer.their_node_id.unwrap())); | ||
peer.pending_outbound_buffer.push_back(peer.channel_encryptor.encrypt_message(&encode_msg!(&$msg)[..])); | ||
// we are in a context where conduit is known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use an assert instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it doesn't make sense because the other variables are assumed to be known, too. Once this is refactored from a macro into a regular method, it will solve itself.
lightning/src/ln/peer_handler.rs
Outdated
if let Some(key) = remote_pubkey_option { | ||
peer.their_node_id = Some(key); | ||
} | ||
|
||
if let Some(conduit) = conduit_option { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't all this be done within the PeerState::Handshake
block above? Then you wouldn't need conduit_option
and remote_pubkey_option
there with the de-structured moves in the if let
statements above. You would just place this code there instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in rust 1.22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah for conduit at least. At very least you shouldn't have destruct the Option
when assigning to conduit_option
. You can simply unconditionally assign:
conduit_option = conduit;
peer.pending_outbound_buffer.push_back(peer.channel_encryptor.encrypt_message(&encode_msg!(msg))); | ||
if let PeerState::Connected(ref mut conduit) = peer.encryptor { | ||
peer.pending_outbound_buffer.push_back(conduit.encrypt(&encode_msg!(msg))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a peer in node_id_to_descriptor
, it should also have a conduit. Any chance this can be done within get_peer_for_forwarding
so the repetition across the method could be removed? We probably want to assert this as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, though here too, I think that it will be much clearer once get_peer_for_forwarding is refactored from being a macro, which I strongly believe it doesn't need to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't whether it is a macro or method orthogonal to whether the duplication should be removed?
FWIW, my refactor touches this method so avoiding all the merge conflict where this change is duplicate (~16 places) would be ideal. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree the redundancy of all these calls suck. Happy for y'all to hash out which PR it gets fixed in, though.
/// Returned after a successful handshake to encrypt and decrypt communication with peer nodes. | ||
/// It should not normally be manually instantiated. | ||
/// Automatically handles key rotation. | ||
/// For decryption, it is recommended to call `decrypt_message_stream` for automatic buffering. | ||
pub struct Conduit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word "conduit" is another name for "channel", which I assume you are trying not to overload. That said, nothing about this struct's behaviors fit the definition of those words. That is, the struct is not responsible for transferring (flowing) data from a source to a destination; the caller does that. Rather, it is simply responsible for encrypting and decrypting data that is transferred.
Further, the fields of the struct are essentially divided into analogous "sending" and "receiving" fields used for encrypting and decrypting respectively. And these fields are disjoint. Therefore, I'd recommend dividing this struct into two structs called Encryptor
and Decryptor
. The common associated functions can be moved to the module-level. Then, each struct has a single responsibility and there is no need to distinguish between types of keys and nonces (and mistakenly using the wrong one).
Then your PeerState
enum can become:
enum PeerState {
Authenticating(PeerHandshake),
Connected(Encryptor, Decryptor),
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes perfect sense and I would love to do that, though what would you call the module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about simply "encryption.rs"? Or you could break them into "encryptor.rs" and "decryptor.rs" with key rotation utilities living elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. But I'd reserve it for a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to use name Transcoder
lightning/src/ln/peers/conduit.rs
Outdated
pub(crate) sending_key: [u8; 32], | ||
pub(crate) receiving_key: [u8; 32], | ||
|
||
pub(crate) sending_chaining_key: [u8; 32], | ||
pub(crate) receiving_chaining_key: [u8; 32], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make constants for these? Or reuse some from secp256k1::constants
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These aren't constants as much as they are types, but definitely.
lightning/src/ln/peers/conduit.rs
Outdated
let length = buffer.len() as u16; | ||
let length_bytes = byte_utils::be16_to_array(length); | ||
|
||
let mut ciphertext = vec![0u8; 18 + length as usize + 16]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, with regard to constants.
lightning/src/ln/peers/conduit.rs
Outdated
/// Add newly received data from the peer node to the buffer and decrypt all possible messages | ||
pub fn decrypt_message_stream(&mut self, new_data: Option<&[u8]>) -> Vec<Vec<u8>> { | ||
let mut read_buffer = if let Some(buffer) = self.read_buffer.take() { | ||
buffer | ||
} else { | ||
Vec::new() | ||
}; | ||
|
||
if let Some(data) = new_data { | ||
read_buffer.extend_from_slice(data); | ||
} | ||
|
||
let mut messages = Vec::new(); | ||
|
||
loop { | ||
// todo: find way that won't require cloning the entire buffer | ||
let (current_message, offset) = self.decrypt(&read_buffer[..]); | ||
if offset == 0 { | ||
break; | ||
} | ||
|
||
read_buffer.drain(0..offset); | ||
|
||
if let Some(message) = current_message { | ||
messages.push(message); | ||
} else { | ||
break; | ||
} | ||
} | ||
|
||
messages | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look like it used outside of a test. Let's leave it out until it is needed. It would be better to review when the use case is present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a better idea. It will be in the next revision.
lightning/src/ln/peers/conduit.rs
Outdated
return (None, 0); | ||
} | ||
|
||
let encrypted_length = &buffer[0..18]; // todo: abort if too short |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this TODO was addressed above, right?
lightning/src/ln/peers/conduit.rs
Outdated
|
||
fn increment_nonce(nonce: &mut u32, chaining_key: &mut [u8; 32], key: &mut [u8; 32]) { | ||
*nonce += 1; | ||
if *nonce == 1000 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a constant for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, you meant the 1,000
lightning/src/ln/peers/conduit.rs
Outdated
for _ in 0..1002 { | ||
let encrypted_message = encrypted_messages.remove(0); | ||
let mut decrypted_messages = remote_peer.decrypt_message_stream(Some(&encrypted_message)); | ||
assert_eq!(decrypted_messages.len(), 1); | ||
let decrypted_message = decrypted_messages.remove(0); | ||
assert_eq!(decrypted_message, hex::decode("68656c6c6f").unwrap()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it is testing a different behavior so it should probably be a different test.
lightning/src/ln/peers/conduit.rs
Outdated
assert_eq!(encrypted_messages[0], hex::decode("cf2b30ddf0cf3f80e7c35a6e6730b59fe802473180f396d88a8fb0db8cbcf25d2f214cf9ea1d95").unwrap()); | ||
assert_eq!(encrypted_messages[1], hex::decode("72887022101f0b6753e0c7de21657d35a4cb2a1f5cde2650528bbc8f837d0f0d7ad833b1a256a1").unwrap()); | ||
assert_eq!(encrypted_messages[500], hex::decode("178cb9d7387190fa34db9c2d50027d21793c9bc2d40b1e14dcf30ebeeeb220f48364f7a4c68bf8").unwrap()); | ||
assert_eq!(encrypted_messages[501], hex::decode("1b186c57d44eb6de4c057c49940d79bb838a145cb528d6e8fd26dbe50a60ca2c104b56b60e45bd").unwrap()); | ||
assert_eq!(encrypted_messages[1000], hex::decode("4a2f3cc3b5e78ddb83dcb426d9863d9d9a723b0337c89dd0b005d89f8d3c05c52b76b29b740f09").unwrap()); | ||
assert_eq!(encrypted_messages[1001], hex::decode("2ecd8c8a5629d0d02ab457a0fdd0f7b90a192cd46be5ecb6ca570bfc5e268338b1a16cf4ef2d36").unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is testing two different behaviors: chaining and key rotation? Ideally, tests should only test one behavior. Could we split this up? I realize this is the test data from BOLT 4, so happy to hear the argument if you think it should be one big test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it is testing both. Arguably the key rotation is part of the encryption. I can split it up into two tests if you like, but in a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you test decryption in itself ? +1 for splitting into multiple tests and documenting what exactly is tested.
@@ -0,0 +1,38 @@ | |||
/// Wrapper for the first act message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to get a better sense as to when separate files should be used for some of these smaller structs. In particular, handshake/acts.rs, handshake/hash.rs, and handshake/states.rs are used almost exclusively in handshake/mod.rs. And they constitute less than 100 lines total. What criteria did you used to split these out? Do they deserve separate sub-modules when the structs within them are themselves named using their sub-module name and are not independently tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, just easier navigation through the components in the file system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given these are small and either are implementation details or form the public interface of handshake/mod.rs, I'd prefer if they were all in the same file. Otherwise, navigation between files is a bit excessive, IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to not stuff the handshake logic with all the accessory structs. I would be open to having one separate file where both the enums/states and the acts would live?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see them as accessories but rather core parts of the handshake module's interface (acts.rs) and implementation (states.rs). The former requires the user to include two things when using the interface. The latter requires the reader to reference another file when reading the implementation details.
I'd concede that an argument can be made for keeping hash.rs as a separate file since it is encapsulating functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, so does the Act enum
let remote_pubkey_vec = chacha::decrypt(&act_three_expectation.temporary_key, 1, &hash.value, &tagged_encrypted_pubkey)?; | ||
let mut remote_pubkey_bytes = [0u8; 33]; | ||
remote_pubkey_bytes.copy_from_slice(remote_pubkey_vec.as_slice()); | ||
// todo: replace unwrap with handleable error type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you address this TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in a subsequent commit
|
||
let mut ephemeral_public_key_bytes = [0u8; 33]; | ||
ephemeral_public_key_bytes.copy_from_slice(&act_bytes[1..34]); | ||
// todo: replace unwrap with handleable error type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you address this TODO?
|
||
/// Process act dynamically | ||
/// The role must be set before this method can be called | ||
pub fn process_act(&mut self, input: &[u8], remote_public_key: Option<&PublicKey>) -> Result<(Vec<u8>, Option<Conduit>, Option<PublicKey>), String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some idle thoughts on this interface (process_act
) and that of process_act_two
and process_act_three
:
Rather than returning tuples of Option
s in process_act
and the non-Option
values in the other two methods, could they (i.e., PublicKey
and Conduit
) instead be set as fields in the appropriate act states (without using Option
)? Then accessor methods on PeerHandshake
could be provided for retrieving them (using Option
). That way the code is a bit more self-documented and the return values of the act processing methods would be simpler (i.e., just returning the next Act to send).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the act states are only what's necessary for the next step of the handshake. I think the handshake can be thought of as a chemical reaction, where there is an educt, a product, as well as some external inputs and byproducts. The handshake states are educts and products, and the conduit is a byproduct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the acts and the remote public key in this analogy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed a bit offline. One idea is to have PeerHandshake
initialized with the optional remote public key, so that it is not both an input and an output of process_act
. And then have an accessor for accessing it. Similarly with the conduit.
Then the only return value would be the act data which could even be the Act
enum instead given it simply wraps the vector, which would make the interface more explicit.
fix last lint doc issue? make the &Some matches explicit for Rust 1.22 appease Rust 1.22 some more with ampersandery appease Rust 1.22 by using byte_utils for endianness functionality appease Rust 1.22 by using byte_utils and ref in match arms experimenting some more with ref matching for Rust 1.22 might Rust 1.22 finally work? Rearranged a lot of borrowing locations and macro behavior and definition locations fix bug that was kindly caught by fuzz tests fix fuzz test improve error messaging the different rust versions are a balancing act, and I can't juggle
4508a58
to
2c8e38d
Compare
…stants, and type definitions respond to Jeff's comments and add node pubkeys to map simplify review by means of 3 scopes adjust scope indentation resolve merge conflicts respond to more of Jeff's comments constantify key rotation index for conduit
2c8e38d
to
8169b31
Compare
… handshake method `process_act` across both call sites from peer_handler.rs.
Travis seems to have forgotten about this |
lightning/src/ln/peers/conduit.rs
Outdated
|
||
/// Decrypt a single message. Buffer is an undelimited amount of bytes | ||
fn decrypt(&mut self, buffer: &[u8]) -> (Option<Vec<u8>>, usize) { // the response slice should have the same lifetime as the argument. It's the slice data is read from | ||
if buffer.len() < 18 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you constify or document here than a short read less than encrypted_message_length+MAC is invalid ?
lightning/src/ln/peers/conduit.rs
Outdated
length_bytes.copy_from_slice(length_vec.as_slice()); | ||
let message_length = u16::from_be_bytes(length_bytes) as usize; | ||
|
||
let message_end_index = message_length + 18; // todo: abort if too short |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 16-byte of MAC of the Lightning message counted in message length ? I'm not sure as l=len(m) computed before message encryption so shouldn't be 18 + message_length + 16 here?
lightning/src/ln/peers/conduit.rs
Outdated
assert_eq!(encrypted_messages[0], hex::decode("cf2b30ddf0cf3f80e7c35a6e6730b59fe802473180f396d88a8fb0db8cbcf25d2f214cf9ea1d95").unwrap()); | ||
assert_eq!(encrypted_messages[1], hex::decode("72887022101f0b6753e0c7de21657d35a4cb2a1f5cde2650528bbc8f837d0f0d7ad833b1a256a1").unwrap()); | ||
assert_eq!(encrypted_messages[500], hex::decode("178cb9d7387190fa34db9c2d50027d21793c9bc2d40b1e14dcf30ebeeeb220f48364f7a4c68bf8").unwrap()); | ||
assert_eq!(encrypted_messages[501], hex::decode("1b186c57d44eb6de4c057c49940d79bb838a145cb528d6e8fd26dbe50a60ca2c104b56b60e45bd").unwrap()); | ||
assert_eq!(encrypted_messages[1000], hex::decode("4a2f3cc3b5e78ddb83dcb426d9863d9d9a723b0337c89dd0b005d89f8d3c05c52b76b29b740f09").unwrap()); | ||
assert_eq!(encrypted_messages[1001], hex::decode("2ecd8c8a5629d0d02ab457a0fdd0f7b90a192cd46be5ecb6ca570bfc5e268338b1a16cf4ef2d36").unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you test decryption in itself ? +1 for splitting into multiple tests and documenting what exactly is tested.
} | ||
|
||
impl PeerHandshake { | ||
pub fn new(private_key: &SecretKey, ephemeral_private_key: Option<&SecretKey>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't PeerHandshake get a KeysInterface trait and access node key secret through it ? Maybe not in this PR but you should let a TODO about this, all ecdh operations with key acess should be made behind this.
If we do have to do any operation with private key in-memory, we should document somewhere to zero'out the memory after we don't need them anymore, at PeerHandshake destruction I'm not sure if memory is cleaned
|
||
let (act_three, mut conduit) = self.process_act_two(ActTwo(act_two_buffer))?; | ||
|
||
if self.read_buffer.len() > 0 { // have we received more data still? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a bit of a blur between handshake phase and transport phase, we already have the offset with act_length, can't we use this to drive conduit reading in a further step ? And avoid clone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear you, but haven't really figured out a nice alternative yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, I figure out after pursing review than I agree it's not that much easy
lightning/src/ln/peer_handler.rs
Outdated
() => { | ||
match peers.node_id_to_descriptor.entry(peer.their_node_id.unwrap()) { | ||
hash_map::Entry::Occupied(_) => { | ||
log_trace!(self, "Got second connection with {}, closing", log_pubkey!(peer.their_node_id.unwrap())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: precise you're closing both connections in the log_trace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log entry is just getting moved; changing it will also require changing the no breakage unit test, which I strongly wish to avoid in such a massive refactor.
5155970
to
d84dcc8
Compare
d84dcc8
to
a4fff76
Compare
} | ||
|
||
if peer.pending_read_buffer_pos == peer.pending_read_buffer.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the handshake pending_read_buffer
is initialized to an act length vector filled with 0s. So, until the full act arrives this branch will not be taken. This PR returns an Err
if the process_act()
doesn't receive the full act and results in a PeerDataProcessingDecision::Disconnect
.
|
||
let encrypted_length = &buffer[0..TAGGED_MESSAGE_LENGTH_HEADER_SIZE]; | ||
let mut length_bytes = [0u8; MESSAGE_LENGTH_HEADER_SIZE]; | ||
length_bytes.copy_from_slice(&chacha::decrypt(&self.receiving_key, self.receiving_nonce as u64, &[0; 0], encrypted_length).unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found this bug while adding to the fuzz testing today. It should return an error here in the event of a decryption error instead of panicking. This is a bit involved since the Iterator
implementation now needs to return Result
objects, but it is doable.
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
@julianknutsen @arik-so How do we move forward with this PR and "Enum Dispatch NOISE State Machine" ? Should we focus review on this PR first then open a rebased "Enum Dispatch NOISE State Machine" ? Or "Enum Dispatch NOISE State" should supersede it with a new PR ? |
I think the nature of the NOISE PR updating the branch that will then update the modular handshake PR automatically requires that we focus on the NOISE one first. |
I agree. I think that the end of this is going to look like a "feature branch" represented by https://github.com/arik-so/rust-lightning/tree/modular_handshake and a set of PRs against that branch that address some of the outstanding review comments as well as this handshake state machine to get it into a point where the modular handshake PR is ready to go. There are still a few outstanding action items from this original PR that I am addressing in future PRs targetting ariks's branch so it seems like vetting code into that branch and then a final PR from his branch into master is the path of least resistance. Still going to be quite a bit of work, but that is just the nature of long-standing dependent PRs. If there is additional legwork you would like me to do (rebasing, merge PRs, etc) to make it easy just let me know. I know that this wasn't planned work so I want to make it as easy as possible to integrate the changes and additional testing. |
Okay I'm even on this, will review the NOISE PR against Arik's repo so. At least as we track back review context. |
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
Superseded by PR stack ending at #694 |
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
Closing, as it's superseded. |
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs. arrays in this layer of the code for fragmentation reasons. To get ahead of the same issues in this new state machine and to limit the behavior changes from master, remove Vec from the state machine in favor of thin wrappers around fixed-size arrays. This patch reintroduces the Act object (a thin wrapper around fixed size arrays) with some convenience features to make them a bit easier to pass around and build. The Handshake code is still note Vec-clean as the encryption code uses Vecs during encryption, but it is one step closer to passing back slices to the peer_handler.
Refactor peer channel encryptor into a whole separate, isolated module and abstract away most of the internals of handshake handling from the peer handler